Abstract

Background: Emergency department (ED) crowding is just beginning to be quantified. The only two scales 
presently available are the National Emergency Department Overcrowding Scale (NEDOCS) and the 
Emergency Department Work Index (EDWIN). 

Objectives: To assess the value of the NEDOCS and the EDWIN in predicting overcrowding. The hypo- 
thesis of this study was that the NEDOCS and the EDWIN would be equally sensitive and specific for 
overcrowding. 

Methods: The NEDOCS, the EDWIN, and an overcrowding measure (OV) were determined every two 
hours for a ten-day period in December 2004. The NEDOCS is a statistically derived calculation, and the 
EDWIN is a formula-based calculation. The overcrowding measure is a composite of physician and charge 
nurse expert opinion on the degree of overcrowding as measured on a 100-mm visual analogue scale 
(VAS). The primary outcome, overcrowding, was based on the dichotomized OV VAS score at the midpoint 
of 50 mm (>50, overcrowded; <50, not overcrowded). The area under the receiver operator characteristic 
curve (AUC) and an index of adequacy (relative prognostic content) of each measure, on the basis of the 
likelihood ratio chi-square statistic, were computed to evaluate the performance of NEDOCS and EDWIN. 

Results: There were 130 completed sampling times over ten days. The OV indicated that the ED was
overcrowded 62% of the time. The AUC for the NEDOCS was 0.83 (95% CI = 0.75 to 0.90), and the AUC for the
EDWIN was 0.80 (95% CI = 0.73 to 0.88). The NEDOCS score accounts for 97% of the prognostic information
provided by combining all variables used in each model into one combined model. The EDWIN score
accounts for only 86% (χ² test for difference, p = 0.02).

Conclusions: Both scales had high AUCs, correlated well with each other, and showed good discrimination 
for predicting ED overcrowding. This establishes construct validity for these scales as measures of over- 
crowding. Which scale is used in an ED is dependent on which set of data is most readily available, with 
the favored scale being the NEDOCS. 
In 2004, Hwang and Concato stated, "Although
emergency department (ED) overcrowding has 
been a topic of frequent investigation, current defini- 
tions of the problem are often implicit or focus on factors 
outside of the ED itself. A more consistent approach 
to defining ED overcrowding would help to clarify 
the distinctions between causes, characteristics, and 
outcomes." 1 In the past there has been no standardiza- 
tion and no generalizable definition of overcrowding. 
Even the American College of Emergency Physicians' 
definition is difficult to apply across different EDs with
differing problems. 

Emergency department overcrowding is only begin- 
ning to be studied in ways that allow for standardized 
quantification of the problem. 2-5 Methods of quantifica- 
tion have been based on different theoretical constructs, 
but all use emergency providers' opinions as an outcome 
variable. The two primary scales available at present 
for quantifying overcrowding are the NEDOCS 5 and 
the EDWIN model. 3 Although each was developed by 
using different methodology, both attempt to model the 
same outcome variable of real-time expert opinion on 
ED overcrowding. 

The purposes of this study were to determine the valid- 
ity of the NEDOCS and EDWIN in comparison to ED 
expert opinion on overcrowding, to determine whether 
there was a similar construct validity represented by 
both scales, and to evaluate the two scales in comparison 
to each other as measures of overcrowding. 

METHODS 



Study Design 

This was a prospective study of the assessment mecha- 
nisms for ED overcrowding. The study did not involve 
any patient contact, and all identifiers were removed 
from the information as it was obtained. The institutional 
review board approved this study as exempt from in- 
formed consent requirements. 

Study Setting and Population 

The study was performed in an inner-city, Level 3 trauma 
center with an ED patient census of >60,000 per year. 

Study Protocol 

An overcrowding measure (OV) was calculated on the 
basis of the method used by the NEDOCS investigators. 
Our ED has two physicians and one charge nurse
covering the unit at all times. At sampling times, both ED
physicians and the ED charge nurse were asked to rate 
the degree of overcrowding on a 100-mm visual ana- 
logue scale (VAS). Results were combined into a compos- 
ite outcome score by taking the average of the physician 
and nurse scores. The primary outcome, overcrowding, 
was based on the OV VAS score being above or below 
the midpoint of the VAS (overcrowded, >50 mm; not 
overcrowded, <50 mm). 
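
The composite outcome described above can be sketched in a few lines. This is a minimal illustration, not the study's actual data pipeline: the function names are hypothetical, the three raters are assumed to be weighted equally, and a score of exactly 50 mm (which the wording here leaves unassigned) is treated as overcrowded, following the "at least 50 mm" rule given under Data Analysis.

```python
from statistics import mean


def composite_ov(physician_scores_mm, nurse_score_mm):
    """Average the VAS ratings (0 to 100 mm) of the physicians and the charge nurse.

    Assumption: every rater carries equal weight in the average.
    """
    ratings = list(physician_scores_mm) + [nurse_score_mm]
    for rating in ratings:
        if not 0 <= rating <= 100:
            raise ValueError("VAS ratings must lie between 0 and 100 mm")
    return mean(ratings)


def is_overcrowded(ov_mm, cutoff_mm=50.0):
    """Dichotomize the composite score at the VAS midpoint.

    Assumption: a score exactly at the cutoff counts as overcrowded.
    """
    return ov_mm >= cutoff_mm


# Two physicians rate 62 mm and 70 mm; the charge nurse rates 48 mm.
ov = composite_ov([62, 70], 48)
overcrowded = is_overcrowded(ov)
```

Averaging before dichotomizing means that a single low rating does not by itself change the outcome when the other two raters perceive crowding.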

The NEDOCS Model. The first model evaluated was the 
NEDOCS. The criteria for NEDOCS variables were 
the following: 1) represented a snapshot of the ED, 2) 
represented an aspect of ED patient management (for 
example, triage, treatment, and disposition), 3) was read- 
ily available, 4) was definable such that results were 
reproducible between observers, and 5) was consistently 
defined between institutions. The variable of patient acu- 
ity, the definition of which varies greatly between insti- 
tutions, was not used. The NEDOCS was created in a 
stepwise fashion, leading to a reduced model of five var- 
iables. The reduced model of overcrowding includes the 
following items: 1) ED patients (indexed to ED beds), 2) 
number of ventilators in use in the ED, 3) longest admit 
time, 4) waiting room time for the last patient called to
a bed, and 5) indexed admits in the ED (indexed to hospital
beds). These items were entered into the NEDOCS
algorithm and yielded a score between 1 and 200, with
scores below 100 considered not overcrowded and scores
of 100 or more considered overcrowded. Within
this spectrum, six categories exist, from not busy to dan- 
gerously overcrowded. The NEDOCS was used exactly as 
described in earlier publications. 5,6 

The EDWIN Model. EDWIN is defined as

$$\mathrm{EDWIN} = \frac{\sum_i n_i t_i}{N_a (B_T - B_A)},$$

where $n_i$ = number of patients in the ED in triage category
$i$, $t_i$ = triage category, $N_a$ = the number of attending
physicians on duty, $B_T$ = the number of treatment bays,
and $B_A$ = the number of admitted patients in the ED. 3 The triage
system used was the Emergency Severity Index (ESI), a
five-level instrument that has high interobserver agreement 
and is associated with resource use and hospitalization 
rates. 7 
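
As a worked illustration of the EDWIN definition (acuity-weighted patient count divided by the product of attending physicians and treatment bays not occupied by admitted patients), the following sketch computes the index from a census snapshot. The function and parameter names are hypothetical, and the numeric weight attached to each triage category is left to the caller because this section does not state how ESI levels map to the weights.

```python
def edwin(patients_by_triage, attendings, treatment_bays, admitted_in_ed):
    """Compute EDWIN = sum(n_i * t_i) / (N_a * (B_T - B_A)).

    patients_by_triage maps a triage weight t_i to the number of patients n_i
    currently in that category (assumption: weights are supplied by the caller).
    """
    free_bays = treatment_bays - admitted_in_ed
    if attendings <= 0 or free_bays <= 0:
        # The index is undefined when no attending is on duty or every bay
        # is occupied by an admitted patient.
        raise ValueError("EDWIN is undefined for this snapshot")
    workload = sum(weight * count for weight, count in patients_by_triage.items())
    return workload / (attendings * free_bays)


# Hypothetical snapshot: 4, 6, and 10 patients with weights 1, 2, and 3,
# two attendings, 30 treatment bays, 10 admitted patients boarding in the ED.
score = edwin({1: 4, 2: 6, 3: 10}, attendings=2, treatment_bays=30, admitted_in_ed=10)
```

In this snapshot the workload is 46 and the denominator is 2 × 20 = 40, so the index is 1.15. Boarding patients raise the index by shrinking the denominator even when the census is unchanged.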

Sampling Methods. This study calculated both the 
NEDOCS and the EDWIN scores every two hours for a 
ten-day period. All values for the EDWIN model were 
available for download from our computerized triage sys- 
tem. For the NEDOCS, the computerized triage system 
easily presented the number of patients and the num- 
ber of admissions. For the calculation of ED wait and 
admit times, computerized system snapshots were down- 
loaded every 5 minutes, which allowed us to determine 
these times within a 5-minute time frame. We obtained 
the number of respirator patients in the ED from an 
attending physician. The outcome variable was obtained 
by asking the ED attendings and the charge nurse to rate 
independently the level of ED crowding. The VAS was a 
100-mm line that used six-point Likert levels similar to 
those used to validate the NEDOCS. The Likert scale corresponding
to levels of overcrowding was shown to all participants,
next to the VAS. Neither of the two scales had
been derived previously or validated in this particular ED. 

All data necessary for the ED overcrowding scales 
were readily available. Results were calculated for each 
scale on the basis of published information. 

Data Analysis 

Descriptive statistics were used to characterize the 
sampling times. Pearson correlation coefficients were 
calculated for testing associations among the two over- 
crowding scales and the composite OV. Our primary out- 
come was overcrowding, yes/no. Overcrowding was set 
to yes if the OV VAS score was at least 50 mm. Otherwise 
it was set to no. Receiver operator characteristic (ROC) 
curves were used to compare the association between 
sensitivity and specificity of an overcrowding scale for 
various cut points, thus allowing the determination of 
an optimal cut point. To determine the predictive ability 
of the overcrowding scales, three logistic regression 
models were used to predict overcrowding from NE- 
DOCS alone, EDWIN alone, and both scales. Predictive 
discrimination of the scales was determined by using 
the C-statistic, which is a generalization of the area under 
the ROC curve (AUC). An AUC or C-statistic of 1.0 
indicates perfect predictive discrimination, and an AUC
or C-statistic of 0.50 indicates a test that does not discriminate
between overcrowding and no overcrowding.
An AUC or C-statistic of at least 0.80 is considered to 
have good discrimination. 8 Because a model based on 
data obtainable from one of the scales would simplify 
data collection, we compared the performance of each 
model with a combined model consisting of all of the var- 
iables used in both scales. The comparison was made by 
using likelihood ratio (LR) chi-square statistics, which are 
sensitive measures of model fit. A measure of adequacy 
was determined on the basis of the ratio of the LR statis- 
tics with a scale alone, compared with the overall model 
likelihood ratio statistics. This is a unitless index of ade- 
quacy of a subset of predictors, or here, of the individual 
scales. To compare the superiority of NEDOCS over 
EDWIN and vice versa, a single-factor chi-square test 
was used by comparing the LR chi-square statistics of 
the models with each scale alone. 9 This LR model is 
well cited statistically and has been used elsewhere to 
compare study measurements. 10,11 
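
For a single scale, the C-statistic can be read as the proportion of (overcrowded, not overcrowded) pairs of sampling times in which the overcrowded time has the higher score, with ties counted as one half. The sketch below computes it directly from that definition. It is an illustration only: the function name is hypothetical, and the study itself obtained the statistic from logistic regression models rather than from this pairwise count.

```python
def c_statistic(scores, outcomes):
    """Pairwise-concordance estimate of the area under the ROC curve.

    scores: scale values (for example NEDOCS or EDWIN), one per sampling time.
    outcomes: truthy where the sampling time was rated overcrowded.
    """
    positives = [s for s, y in zip(scores, outcomes) if y]
    negatives = [s for s, y in zip(scores, outcomes) if not y]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    concordance = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordance += 1.0  # overcrowded time scored higher
            elif p == n:
                concordance += 0.5  # tie counts as half
    return concordance / (len(positives) * len(negatives))
```

A value of 1.0 means every overcrowded time outscored every non-overcrowded time, and 0.5 means the scale orders such pairs no better than chance.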

RESULTS 



There were 131 sampling times over ten days. Only one 
sampling time was missed during that period, for a total 
of 130 completed sampling times. The VAS range for the 
outcome variable was 4 to 91 mm with a mean (±SD) of 51
(±24.6). The median VAS was 56, and the interquartile 
range (IQR) was 29 to 72 mm. The OV indicated that 
the ED was overcrowded in 80 (62%) of the 130 sampling 
times. 

The median score for the NEDOCS was 93 (IQR: 72, 
112), with a mean (±SD) score of 91 (±28.8) and a range 
of 31 to 144. The median score for EDWIN was 1.54 
(IQR: 1.33, 1.83), the mean score was 1.58 (±0.43), and 
the range was 0.60 to 2.62. 

The Pearson correlation coefficient for testing the 
association between NEDOCS and OV was 0.71, and 
for EDWIN and OV it was 0.74 (both, p < 0.001). EDWIN 
and NEDOCS results are highly correlated with each 
other and with the expert opinion on overcrowding, as 
shown in Figure 1 (r = 0.84; p < 0.001). 

When the EDWIN and the NEDOCS were compared 
with the overcrowding variable, ROC curves were rea- 
sonably similar (Figure 2). The AUC for the NEDOCS 
was 0.83 (95% CI = 0.75 to 0.90), and the AUC for the 
EDWIN was 0.80 (95% CI = 0.73 to 0.88). Cutoffs were de- 
termined for both scales on the basis of a sensitivity of 
80%. A cutoff of 87 on the NEDOCS was 80% sensitive 
and 71% specific, whereas a cutoff of 1.40 on the EDWIN 
was 80% sensitive and 63% specific for overcrowding. 
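
Choosing a cutoff for a target sensitivity, as was done above, amounts to scanning candidate thresholds from high to low and stopping at the first one that flags the required share of overcrowded times. The following sketch shows one way to do this. The function name is hypothetical, and it assumes that a score at or above the cutoff is called overcrowded, a tie rule the text does not specify.

```python
def cutoff_for_sensitivity(scores, outcomes, target=0.80):
    """Return (cutoff, sensitivity, specificity) for the highest cutoff whose
    sensitivity reaches the target.

    Assumption: a score >= cutoff is classified as overcrowded.
    """
    positives = [s for s, y in zip(scores, outcomes) if y]
    negatives = [s for s, y in zip(scores, outcomes) if not y]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    for cutoff in sorted(set(scores), reverse=True):
        sensitivity = sum(s >= cutoff for s in positives) / len(positives)
        if sensitivity >= target:
            specificity = sum(s < cutoff for s in negatives) / len(negatives)
            return cutoff, sensitivity, specificity
    raise ValueError("target sensitivity cannot be reached")


# Hypothetical scores; truthy outcomes mark overcrowded sampling times.
scores = [60, 70, 85, 90, 95, 100, 110, 120, 130, 140]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1, 1, 0]
cutoff, sens, spec = cutoff_for_sensitivity(scores, outcomes)
```

Taking the highest qualifying cutoff keeps specificity as large as possible for the chosen sensitivity, which is why two scales fixed at the same sensitivity can be compared on specificity alone.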

The C-statistic for the logistic regression model for 
NEDOCS was 0.83. The IQR odds ratio, comparing the 
75th percentile with the 25th percentile (112 vs. 73 NE- 
DOCS), was 8.61 (95% CI = 3.99 to 18.54). The C-statistic 
for the logistic regression model for EDWIN was 0.81. 
The IQR odds ratio, comparing the 75th percentile with 
the 25th percentile (1.83 vs. 1.33 EDWIN), was 6.61 
(95% CI = 3.17 to 13.79). 
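
An IQR odds ratio rescales the logistic regression slope so that it describes a move from the 25th to the 75th percentile of the scale rather than a one-unit change. The sketch below shows the relationship in both directions. It assumes the scale enters the model as a single linear term; the fitted coefficients are not reported in this article, so any slope recovered this way is back-calculated and not a published value. The function names are hypothetical.

```python
import math


def iqr_odds_ratio(slope, q1, q3):
    """Odds ratio for moving from the 25th (q1) to the 75th (q3) percentile,
    given the per-unit log-odds slope of a linear logistic model."""
    return math.exp(slope * (q3 - q1))


def slope_from_iqr_odds_ratio(odds_ratio, q1, q3):
    """Back-calculate the per-unit log-odds slope implied by an IQR odds ratio."""
    return math.log(odds_ratio) / (q3 - q1)


# Reported values: 8.61 for NEDOCS (112 vs. 73) and 6.61 for EDWIN (1.83 vs. 1.33).
nedocs_slope = slope_from_iqr_odds_ratio(8.61, 73, 112)
edwin_slope = slope_from_iqr_odds_ratio(6.61, 1.33, 1.83)
```

Because the two scales have very different units, their per-unit slopes are not comparable, whereas the IQR odds ratios place both on a common "typical low versus typical high" footing.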

Figure 1. Comparison of results of the Emergency Department
Work Index (EDWIN), the National Emergency Department
Overcrowding Scale (NEDOCS), and overcrowding
expert opinion. Correlation coefficients (r) for the comparison
are 0.71 (top), 0.75 (middle), and 0.84 (bottom).

The LR chi-square statistics for NEDOCS and EDWIN
alone, and a combined model containing both the NEDOCS
and EDWIN, are given in Table 1. For EDWIN,
the adequacy with respect to predicting overcrowding,
compared with the combined model, was 86% (43.13/
50.04). For NEDOCS, the adequacy was 97% (48.56/
50.04). NEDOCS was found to add significantly more
than EDWIN to the combined model (χ² = 5.43;
p = 0.02).
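
The adequacy figures and the test for difference can be reproduced from the three LR chi-square statistics in Table 1. The sketch below does the arithmetic. It assumes that the test statistic is the difference between the two single-scale LR chi-square values referred to a chi-square distribution with 1 degree of freedom, which is consistent with the reported 5.43 and p = 0.02 but is not spelled out in the text. The function names are hypothetical.

```python
import math


def adequacy(lr_scale_alone, lr_combined):
    """Fraction of the combined model's LR chi-square captured by one scale."""
    return lr_scale_alone / lr_combined


def chi2_upper_tail_1df(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)) for 1 df.
    """
    return math.erfc(math.sqrt(statistic / 2.0))


# LR chi-square statistics from Table 1.
lr_edwin, lr_nedocs, lr_combined = 43.13, 48.56, 50.04

edwin_adequacy = adequacy(lr_edwin, lr_combined)    # about 0.86
nedocs_adequacy = adequacy(lr_nedocs, lr_combined)  # about 0.97
difference = lr_nedocs - lr_edwin                   # about 5.43
p_value = chi2_upper_tail_1df(difference)           # about 0.02
```

An adequacy near 100% indicates that adding the other scale's variables contributes little further prognostic information.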

Figure 2. Receiver-operator curves for the National Emer- 
gency Department Overcrowding Scale (NEDOCS) and the 
Emergency Department Work Index (EDWIN). Area under 
the curve is 0.83 for the NEDOCS and 0.80 for the EDWIN. 



DISCUSSION 

Emergency department overcrowding has been related 
to multiple adverse outcomes. The patient's length of 
stay increases, thus increasing the number of patients 
who leave the ED without being seen, 12-15 some of
whom may have conditions that are as serious a
medical emergency as those of patients who stay to be
seen. 12,13,15-21 Among those who stay to be seen, ED
overcrowding affects quality of care 22-25 and patient
satisfaction. 26-31 Inadequate patient care leads to medical
errors as an ED becomes more overcrowded. 32-36 Some
of these errors can be severe enough to lead to death 
and disability. 37 For all of these reasons, EDs must 
continue to work through administrative and political 
channels to prevent overcrowding. 

In this study, we have attempted to improve our under- 
standing of the construct of overcrowding. Not only are 
both scales evaluated here well correlated to a standard- 
ized ED overcrowding outcome variable, but they also 
are well correlated with each other. This study acts both 
as a prospective validation of the two scales and as an 
affirmation of their construct validity. The NEDOCS and 
EDWIN scales also had high AUCs and showed good
discrimination for ED overcrowding.

Table 1

Adequacy* of the NEDOCS and EDWIN

Scales      Chi-Square   Adequacy (%)
EDWIN       43.13        86
NEDOCS      48.56        97
Combined    50.04        100

*Adequacy in reflecting the results of a model consisting of all of the

This study demonstrates a superiority of the NEDOCS
when compared with the EDWIN in measuring overcrowding.
The difference between the scales is relatively
small, and often conditions in the ED will dictate which of 
these scales is a better choice. For example, the EDWIN 
is based on the ESI; if the ESI triage scoring system is 
not used in a particular ED, an EDWIN score cannot 
be calculated. Alternatively, if times from registration 
and from admission decisions are hard to obtain, the 
NEDOCS score cannot be calculated. Each ED must 
therefore determine which scale can be best applied. 

The NEDOCS and the EDWIN both have face validity. 
The NEDOCS was designed on the basis of expert input 
from eight ED sites nationwide and was developed sta- 
tistically by reducing a 20-question model to the best 
5 questions. Face validity of the EDWIN is based on an 
intuitive understanding of ED overcrowding. Both scales 
have helped to advance the science of overcrowding. 
Research can now begin to evaluate the effect of ED 
systems changes on adverse outcomes by using the 
quantitative variables reflecting overcrowding. 

LIMITATIONS 



Emergency department overcrowding research is limited 
by the problem of a soft criterion standard. Both over- 
crowding scales studied here were designed and validated 
on the basis of the standard of emergency providers' 
opinion of overcrowding. 3,5 Although not the perfect 
criterion standard, ED provider opinion appears to be a 
consistent marker for the construct of overcrowding. It 
is as clear as many other criterion standards that have 
been used in other research areas of emergency medicine, 
such as in pulmonary embolism studies, in which the criterion
standard constantly is changing. 38-40 ED expert
opinion is the only logical starting point for development 
of quantitative measures of overcrowding. 

Using a VAS to determine expert opinion has some 
drawbacks. Overcrowding is not necessarily a continu- 
ous phenomenon. Both NEDOCS and EDWIN were 
derived by using Likert-like scales. The use of a continu- 
ous scale, with markings in proximity to the line, ap- 
peared more appropriate to fairly reflect overcrowding 
and allow comparison of the two scales. 

This study was performed at a single academic ED. An 
important factor in the applicability of our results is that 
neither the NEDOCS nor the EDWIN was derived in our
ED. Therefore, we believe that our results can be gener- 
alized to other EDs for both of the scales, thus increasing 
the validity of the results. 

Another limitation is that constructs used to model 
complex behavior must be tested in almost every possible 
type of setting before they are determined to be general- 
izable and valid. Neither of these two scales has yet been 
tested that extensively. The NEDOCS has been validated 
in numerous busy academic centers, whereas the ED- 
WIN has been validated in one academic center. This 
study represents one further step in the process of vali- 
dating these scales. 

CONCLUSIONS 



Both scales had high AUCs, correlated well with each 
other, and showed good discrimination for predicting 
ED overcrowding. This establishes construct validity for 
these scales as measures of overcrowding. Which scale
is used in an ED is dependent on which set of data is 
most readily available, with the favored scale being the 
NEDOCS. 